Here we document the development Atlantis diagnostics code to determine whether the model is meeting define performance and review criteria. Code we be developed to evaluate the model output against detailed performance criteria developed in [1] and the calibration criteria in [2]:
All functional groups persist
Model stabilizes for the last ~20 years of an unfished, unperturbed 80-100 year run
Details for calibration:
Hindcast period established where we have survey/assessment time series with error bounds
Species groups totaling ~80% of system biomass should qualitatively match hindcast biomass trends. During calibration: reproduce historical trends in biomass when forcing the model with historical fishing and environmental drivers.
Patterns of temporal variability captured (emergent or forced with e.g. recruitment time series)
Productivity for groups totaling ~80% of system biomass should qualitatively match FMSY or life history expectations
Natural mortality decreases with age for majority of groups
Age and length structure qualitatively matches expectations for majority of groups
Diet predicted qualitatively matches empirical diet comp for majority of groups
An R function for each criterion is developed below, and a wrapper that runs all functions will be developed and tested here.
We use these R libraries, with non-CRAN package installation instructions in comments.
#remotes::install_github(\"noaa-edab/ecodata\",build_vignettes=TRUE)
#remotes::install_github(\"r4atlantis/atlantisom\",build_vignettes=TRUE)
#remotes::install_github(\"noaa-edab/ecotrend\",build_vignettes=TRUE)
#remotes::install_github("andybeet/atlantisdrive")
library(ecodata)
library(ecotrend)
library(tidyverse)
library(atlantisom)
library(here)
library(googledrive)
library(atlantisdrive)
source("shift_legend.R")
Configure files to be read in here (can have different files to source).
January 2021: source output from google folder EDABranch_Drive > Atlantis > Shared Data or EDABranch_Drive > Atlantis > Shared Data where each set of outputs in its own folder. Source input files from this github repo in folder here("currentVersion").
UPDATE: If files are pushed to google using atlantisdrive then all relevant input and output files will be in the same google folder and will download to the temp folder under diagnostics (or wherever the user directs it).
So the function below will set everything up using this DiagnosticConfig.R file, which loads all files using atlantisdrive:
# CHECK ALL FILENAMES FOR YOUR MODEL
# where are files on the atlantis google drive?
g.name <- "Testing/OutForSarah"
# which local directory should they write to?
d.name <- here::here("diagnostics", "temp")
# input file names
functional.groups.file <- "neus_groups.csv"
biomass.pools.file <- "neus_init.nc"
box.file <- "neus_tmerc_RM2.bgm"
initial.conditions.file <- "neus_init.nc"
fisheries.file <- "neus_fisheries.csv"
biol.prm.file <- "at_biology.prm"
# define the hindcast period--these are survey years
hindcast <- c(1980:2010)
The function diag_init takes the user’s config file as input, brings all Atlantis files into a local directory from google drive, and uses atlantisom functions to make a list of object including species names, functional group attributes, model boxes, run parameters, and biological parameters, and the biomass and catch time series outputs from .txt files. More can be added to this as needed.
diag_init <- function(config_file){
source(config_file)
# pull all files from google
atlantisdrive::pull_from_drive(d.name,
fileList = NULL,
googledriveFolder = g.name)
# get scenario name from a file inside folder: SSB is unique
scenario.name <- gsub("SSB.txt","",list.files(path=d.name, pattern = "*SSB.txt"))
# output file names
run.prm.file <- list.files(path=d.name, pattern = "*at_run*.xml")
bioind.file <- paste0(scenario.name, "BiomIndx.txt")
catch.file <- paste0(scenario.name, "Catch.txt")
#Load functional groups
fgs <- atlantisom::load_fgs(dir=d.name,
file_fgs = functional.groups.file)
#Get just the names of active functional groups
fgs.names <- fgs %>%
dplyr::filter(IsTurnedOn == 1) %>%
dplyr::select(Name) %>%
.$Name
# should return all model areas
boxpars <- atlantisom::load_box(d.name, box.file)
boxall <- c(0:(boxpars$nbox - 1))
# generalized timesteps all models
runpar <- atlantisom::load_runprm(d.name, run.prm.file)
noutsteps <- runpar$tstop/runpar$outputstep
stepperyr <- if(runpar$outputstepunit=="days") 365/runpar$toutinc
timeall <- c(0:noutsteps)
# learned the hard way this can be different from ecosystem outputs
fstepperyr <- if(runpar$outputstepunit=="days") 365/runpar$toutfinc
# load the biomass index results
atBtxt <- atlantisom::load_bioind(d.name, bioind.file, fgs)
#load the catch results
atCtxt <- atlantisom::load_catch(d.name, catch.file, fgs)
# load YOY
YOY <- atlantisom::load_yoy(d.name, paste0(scenario.name, "YOY.txt"))
# load biol_prm
biol <- atlantisom::load_biolprm(d.name, biol.prm.file)
diaglist <-list("fgs" = fgs,
"fgs.names" = fgs.names,
"atBtxt" = atBtxt,
"atCtxt" = atCtxt,
"runpar" = runpar,
"noutsteps" = noutsteps,
"stepperyr" = stepperyr,
"timeall" = timeall,
"fstepperyr" = fstepperyr,
"boxpars" = boxpars,
"boxall" = boxall,
"YOY" = YOY,
"biol" = biol)
return(diaglist)
}
Usage: this produces the list object testdiag to be used in subsequent functions.
# clear the local model output directory for safety if needed
#f <- list.files(d.name, include.dirs = F, full.names = T, recursive = T)
#file.remove(f)
testdiag <- diag_init("DiagnosticConfig.R")
## Using an auto-discovered, cached token.
## To suppress this message, modify your code or options to clearly consent to the use of a cached token.
## See gargle's "Non-interactive auth" vignette for more details:
## https://gargle.r-lib.org/articles/non-interactive-auth.html
## The googledrive package is using a cached token for sarah.gaichas@noaa.gov.
Now we define one function for each diagnostic.
This should be run on an unfished, unperturbed run. Fishing or perturbations may legitimately drive groups extinct. However, for our historical period, we don’t expect anything to go extinct that is in the system now, so this test includes historical fishing, physics, and phytoplankton drivers.
This test assumes that initial (Time = 0) biomass does not count, and averages over a year to see if mean B is 0. Output is a list of species that have crashed and an optional set of plots where 0 biomass is red. The function diag_persist is called with the atBtxt and fgs.names output of the initialization function. Plots can be turned off with plot=FALSE added to the function call.
diag_persist <- function(atBtxt, fgs.names, plot=TRUE){
# need in annual units? Or fail when any output timestep below threshold?
# make safe for migratory species, assume that over the course of the year mean B > 0.
# assumes biomass never goes negative in atlantis
crash <- atBtxt %>%
filter(species %in% fgs.names) %>%
mutate(yr = ceiling(time/365)) %>%
filter(yr > 0) %>%
group_by(species, yr) %>%
summarise(meanB = mean(atoutput)) %>%
filter(meanB == 0)
# flag any groups with any mean annual biomass below a threshold
crashed <- crash %>%
group_by(species) %>%
count()
names(crashed) <- c("Failed", "nYrs0B")
# visualize; hardcoded pages for ~89 group NEUS model
if(plot){
atBtxt$col <- cut(atBtxt$atoutput,
breaks = c(-Inf, 0, Inf),
labels = c("crashed", ">0 B"))
plotB <-ggplot() +
geom_line(data=atBtxt%>%filter(species %in% fgs.names),
aes(x=time/365,y=atoutput, color=col),
alpha = 10/10) +
ggthemes::theme_tufte() +
theme(legend.position = "top") +
labs(colour=g.name)
print(plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 1, scales="free"))
print(plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 2, scales="free"))
print(plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 3, scales="free"))
print(plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 4, scales="free"))
print(plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 5, scales="free"))
print(plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 6, scales="free"))
print(plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 7, scales="free"))
print(plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 8, scales="free"))
}
return(as.data.frame(crashed))
}
Usage, with output:
diag_persist(testdiag$atBtxt, testdiag$fgs.names)
## Failed nYrs0B
## 1 Carrion 55
## 2 Drums_Croakers 1
## 3 Menhaden 45
## 4 Summerflounder 8
This test determines whether the model reaches steady state for last ~20 years of an unperturbed, unfished ~100 year run. I don’t have a 100 year run right now so we will test with the 55 year run and look at the last 20 years, with the understanding that the test should use a different model run.
This test evaluates the last 20 years of the input run for a significant slope using the geom_gls function from ecodata. The function diag_stable also checks for persistence and flags groups that have gone to 0 biomass–they don’t count as stable. It takes the same inputs as diag_persist with the addition of the runpar object, and defaults to plotting, which can be turned off with plot=FALSE in the function call.
diag_stable <- function(atBtxt, fgs.names, runpar, plot=TRUE){
# look for non-significant slope over last 20 years? 30 years would be better
nlast <- 20
startlast <- floor(runpar$nyears)-nlast
#warning, was getting different behavior with this code on my linux machine
stable <- atBtxt%>%filter(species %in% fgs.names) %>%
filter(time %in% seq(startlast*365, floor(runpar$nyears)*365, by=365)) %>%
group_by(species) %>%
ggplot(aes(x=time/365, y=atoutput)) +
ggthemes::theme_tufte() +
geom_line(data=atBtxt%>%filter(species %in% fgs.names),
aes(x=time/365, y=atoutput),
alpha = 5/10) +
geom_gls(warn = FALSE)
print(stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 1, scales="free"))
print(stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 2, scales="free"))
print(stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 3, scales="free"))
print(stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 4, scales="free"))
print(stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 5, scales="free"))
print(stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 6, scales="free"))
print(stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 7, scales="free"))
print(stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 3, page = 8, scales="free"))
# what I need is to extract the output of geom_gls that is NULL (no significant trend)
# this may work eventually but blows up R right now, fix later
# # get results from ecotrend package
# stabletest <- atBtxt %>%
# filter(species %in% fgs.names) %>%
# #mutate(yr = ceiling(time/365)) %>%
# filter(time %in% seq(startlast*365, floor(runpar$nyears)*365, by=365)) %>%
# group_by(species) %>%
# dplyr::do(ecotrend::glsMs(data = ., formula = time ~ atoutput))
#
#ecotrend::glsMs(data = ., formula = yr ~ biomass)
#ecotrend::glsMs(yr ~ biomass, stabletest)
#test this run for persistence too
# flag any groups with any mean annual biomass below a threshold
}
Usage:
diag_stable(testdiag$atBtxt, testdiag$fgs.names, testdiag$runpar)
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
## Warning: Computation failed in `stat_gls()`:
## computed "gls" fit is singular, rank 1
# This has now been moved to the DiagnosticConfig.R file and uses atlantisdrive
# set up files to be read in
# the below can go into a config file to be sourced
# input files directly from atlantis-neus repo
functional.groups.file <- "neus_groups.csv"
biomass.pools.file <- "neus_init.nc"
box.file <- "neus_tmerc_RM2.bgm"
initial.conditions.file <- "neus_init.nc"
fisheries.file <- "neus_fisheries.csv"
biol.prm.file <- "at_biology.prm"
# outputs from gdrive
# location of files, have user pass to diag_init function
g.name <- "https://drive.google.com/drive/folders/1CYLbZlrbjRnXQODeQKb5xUGo6Ry3DC23" # Master_20201020 gdrive folder name
#g.name <- "https://drive.google.com/drive/folders/1CECkhH80QqGQiHRKG-L7SW1v27VckhWD" # Output_11_13_2020_Rob
#get scenario name from a file inside folder: SSB is unique
scenario.name <- gsub("SSB.txt","",(drive_ls(path=g.name, pattern = "*SSB.txt")$name))
d.name <- here("diagnostics", "temp")
# need to download needed files to a temp folder because atlantisom functions dont read gdrive directly
run.prm.file <- drive_ls(path=g.name, pattern = "*at_run*.xml")$name
bioind.file <- paste0(scenario.name, "BiomIndx.txt")
catch.file <- paste0(scenario.name, "Catch.txt")
# use Andy's atlantisdrive functions instead, wrote this before
drive_download(file = drive_ls(path=g.name, pattern = "*at_run*.xml"),
path = here("diagnostics", "temp", run.prm.file),
type = NULL,
overwrite = TRUE,
verbose = TRUE
)
drive_download(file = drive_ls(path=g.name, pattern = paste0(scenario.name, "BiomIndx.txt")),
path = here("diagnostics", "temp", bioind.file),
type = NULL,
overwrite = TRUE,
verbose = TRUE
)
drive_download(file = drive_ls(path=g.name, pattern = paste0(scenario.name, "Catch.txt")),
path = here("diagnostics", "temp", catch.file),
type = NULL,
overwrite = TRUE,
verbose = TRUE
)
Get functional group names for other functions
#Load functional groups
funct.groups <- load_fgs(dir=here("currentVersion"),
file_fgs = functional.groups.file)
#Get just the names of active functional groups
funct.group.names <- funct.groups %>%
filter(IsTurnedOn == 1) %>%
select(Name) %>%
.$Name
Get basic output parameters
# should return all model areas
boxpars <- load_box(here("currentVersion"), box.file)
boxall <- c(0:(boxpars$nbox - 1))
# generalized timesteps all models
runpar <- load_runprm(d.name, run.prm.file)
noutsteps <- runpar$tstop/runpar$outputstep
stepperyr <- if(runpar$outputstepunit=="days") 365/runpar$toutinc
midptyr <- round(median(seq(0,stepperyr)))
# a survey that takes place once per year mid year
annualmidyear <- seq(midptyr, noutsteps, stepperyr)
timeall <- c(0:noutsteps)
# learned the hard way this can be different from ecosystem outputs
fstepperyr <- if(runpar$outputstepunit=="days") 365/runpar$toutfinc
This should be run on an unfished, unperturbed run. Fishing or perturbations may legitimately drive groups extinct.
# read the output atlantis biom.txt file
atBtxt <- load_bioind(d.name, bioind.file, funct.groups)
# visualize; hardcoded pages for ~89 group model
plotB <-ggplot() +
geom_line(data=atBtxt, aes(x=time/365,y=atoutput, color="txt output B"),
alpha = 10/10) +
ggthemes::theme_tufte() +
theme(legend.position = "top") +
labs(colour=scenario.name)
plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 1, scales="free")
plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 2, scales="free")
plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 3, scales="free")
plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 4, scales="free")
plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 5, scales="free")
plotB + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 6, scales="free")
# need in annual units? Or fail when any output timestep below threshold?
# make safe for migratory species, assume that over the course of the year mean B > 0.
# assumes biomass never goes negative in atlantis
crash <- atBtxt %>%
group_by(species, time) %>%
summarise(meanB = mean(atoutput)) %>%
filter(meanB < 1e-4)
# flag any groups with any mean annual biomass below a threshold
unique(crash$species)
Model reaches steady state for last ~20 years of an unperturbed, unfished ~100 year run. Things at 0 shouldn’t count, so run persistence test as well. For plotting, let’s skip the first 20 model years so we only look after spinup.
scenario.name2 <- "atneus_v15_test2008hydro_20180208"
funct.groups <- testdiag$fgs
# read in long run from separate folder
longatBtxt <- read.table(file.path(here("diagnostics", "testfiles", "100_year_no_fishing"), paste0(scenario.name2, "BiomIndx.txt")), header=T)
groupslookup <- funct.groups %>% #assumes we have the same functional groups in this run as the shorter ones
filter(IsTurnedOn > 0)
# read in run pars for the long run too
longrunpar <- load_runprm(here("diagnostics", "testfiles", "100_year_no_fishing"), run.prm.file2) #assumes same name as above, it is
longnoutsteps <- longrunpar$tstop/longrunpar$outputstep
longtimeall <- c(0:longnoutsteps)
longatBtxttidy <- longatBtxt %>%
select(Time:DIN) %>%
rename_(.dots=with(groupslookup, setNames(as.list(as.character(Code)), Name))) %>%
gather(species, biomass, -Time) %>%
mutate(yr = ceiling(Time/365)) %>%
filter(Time %in% seq(20*365, floor(longrunpar$nyears)*365, by=365))
# look for non-significant slope over last 20 years? 30 years would be better
nlast <- 20
startlast <- floor(longrunpar$nyears)-nlast
#warning, was getting different behavior with this code on my linux machine
stable <- longatBtxttidy %>%
filter(Time %in% seq(startlast*365, floor(longrunpar$nyears)*365, by=365)) %>%
group_by(species) %>%
ggplot(aes(x=yr, y=biomass)) +
ggthemes::theme_tufte() +
geom_line(data=longatBtxttidy, aes(x=yr, y=biomass)) +
geom_gls(warn = FALSE)
stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 1, scales="free")
stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 2, scales="free")
stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 3, scales="free")
stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 4, scales="free")
stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 5, scales="free")
stable + ggforce::facet_wrap_paginate(~species, ncol=4, nrow = 4, page = 6, scales="free")
# what I need is to extract the output of geom_gls that is NULL (no significant trend)
# this may work eventually but blows up R right now, fix later
# # get results from ecotrend package
# stabletest <- longatBtxttidy %>%
# filter(Time %in% seq(startlast*365, floor(longrunpar$nyears)*365, by=365)) %>%
# group_by(species) %>%
# dplyr::do(ecotrend::glsMs(data = ., formula = yr ~ biomass))
#
# #ecotrend::glsMs(data = ., formula = yr ~ biomass)
#
# #ecotrend::glsMs(yr ~ biomass, stabletest)
#test this run for persistence too
longcrash <- longatBtxt %>%
select(Time:DIN) %>%
rename_(.dots=with(groupslookup, setNames(as.list(as.character(Code)), Name))) %>%
gather(species, biomass, -Time) %>%
mutate(yr = ceiling(Time/365)) %>%
group_by(species, yr) %>%
summarise(meanB = mean(biomass)) %>%
filter(meanB < 1e-4)
# flag any groups with any mean annual biomass below a threshold
unique(longcrash$species)
The period is defined as 1980-2010 for trend comparison. Change it here to see different results:
hindcast <- c(1980:2010)
Data sources include ecodata and stock assessment outputs for major species.
Not all species need to match, we will first determine the most important subset(s). Based on discussion, we want to track the species the comprise 80% of biomass (and 80% of catch and 80% of revenue).
Code to evaluate which species are most important based on biomass, catch, and revenue, union and intersection.
# first past B, catch, revenue, use ecodata
survbio <- ecodata::nefsc_survey_disaggregated %>%
filter(Time %in% hindcast) %>%
group_by(comname) %>%
summarize(avgkgtow = mean(kg.per.tow, na.rm = T)) %>%
mutate(prop = avgkgtow/sum(avgkgtow, na.rm = T)) %>%
arrange(desc(prop))
hibiosp <- stringr::str_to_sentence(survbio$comname[cumsum(survbio$prop)<0.81 & !is.na(cumsum(survbio$prop))])
# comdat and bennet are already aggregated in ecodata
# need comland, should be able to run similar code as above to get catch and revenue
# NOTE: THIS FILE NOT TO BE POSTED ON GITHUB DUE TO POTENTIAL CONFIDENTIALITY CONCERNS
# Use the comlandr package to create the comland data set or ask Sean for it
comland <- get(load(here::here("diagnostics", "comland_meatwt_deflated_stat_areas.RData")))
# from https://github.com/NOAA-EDAB/Atlantis-Catch-Files/blob/master/Atlantis_1_5_groups_svspp_nespp3.csv
spcodes <- as.data.frame(readr::read_csv(here::here("diagnostics","Atlantis_1_5_groups_svspp_nespp3.csv")))
# time series in case we want to plot them
comlandts <- merge(comland, spcodes, by = "NESPP3") %>%
filter(YEAR %in% hindcast) %>%
group_by(YEAR, Name) %>%
summarise_at(vars(SPPLIVMT, SPPVALUE), sum)
comlandat <- comlandts %>%
group_by(Name) %>%
summarize_at(vars(SPPLIVMT, SPPVALUE), mean, na.rm = T) %>%
rename(avSPPLIVMT = SPPLIVMT, avSPPVALUE = SPPVALUE) %>%
mutate(propcatch = avSPPLIVMT/sum(avSPPLIVMT, na.rm = T)) %>%
mutate(propvalue = avSPPVALUE/sum(avSPPVALUE, na.rm = T)) %>%
arrange(desc(propcatch))
hicatchsp <- comlandat$Name[cumsum(comlandat$propcatch) <0.81 & !is.na(cumsum(comlandat$propcatch))]
# danger, has to be sorted to be correct
comlandatv <- arrange(comlandat, desc(propvalue))
hivalsp <- comlandatv$Name[cumsum(comlandatv$propvalue) <0.81 & !is.na(cumsum(comlandatv$propvalue))]
# add recreational catch and rec value/n participants?
# WARNING our names dont match between survbio comnames and comland Name, fix here otherwise have duplicates
# union all
hibiocatvalsp <- sort(unique(c(hibiosp, hicatchsp, hivalsp)))
Species comprising 80% of survey kg per tow averaged over 1980 to 2010:
Spiny dogfish, Winter skate, Acadian redfish, Haddock, Atlantic cod, Little skate, Atlantic herring, Longfin squid, Silver hake, Smooth dogfish, Butterfish, Pollock, Weakfish, Winter flounder
Species comprising 80% of commercial landings averaged over the same period:
Menhaden, Atlantic herring, Lobster, Atlantic surf clam, Blue crab, Atlantic cod, Ocean quahog, Goosefish, Sea scallop, Atlantic mackerel, Loligo squid, Silver hake, Illex squid
Species comprising 80% of commercial value averaged over the same period:
Lobster, Sea scallop, Blue crab, Atlantic cod, Atlantic surf clam, Atlantic salmon, Summer flounder, Goosefish, Ocean quahog, Menhaden, Loligo squid, Yellowtail flounder, Winter flounder, Haddock
A list with all of these species:
Acadian redfish, Atlantic cod, Atlantic herring, Atlantic mackerel, Atlantic salmon, Atlantic surf clam, Blue crab, Butterfish, Goosefish, Haddock, Illex squid, Little skate, Lobster, Loligo squid, Longfin squid, Menhaden, Ocean quahog, Pollock, Sea scallop, Silver hake, Smooth dogfish, Spiny dogfish, Summer flounder, Weakfish, Winter flounder, Winter skate, Yellowtail flounder
Code to develop the hindcast comparison (call it “reference”) dataset:
# simplest is annual output comparison, but which season should be compared to Atlantis?
# WARNING our names dont match between survbio comnames and comland Name, fix above
survbio.ts <- ecodata::nefsc_survey_disaggregated %>%
filter(Time %in% hindcast) %>%
mutate(comname = stringr::str_to_sentence(comname)) %>%
filter(comname %in% hibiocatvalsp) %>%
group_by(Time, Season, comname) %>%
summarise(annkgtow = sum(kg.per.tow))
tsplot <- ggplot(survbio.ts, aes(Time, annkgtow, color=Season)) + geom_point() + ggthemes::theme_tufte() + facet_wrap(~comname, scales = "free")
grid.draw(shift_legend(tsplot))
## Warning: Removed 512 rows containing missing values (geom_point).
# ecosystem indicators for comparison: total survey biomass
survbio.tot <- ecodata::nefsc_survey_disaggregated %>%
filter(Time %in% hindcast) %>%
group_by(Time, Season) %>%
summarise(annkgtow = sum(kg.per.tow, na.rm = TRUE))
ggplot(survbio.tot, aes(Time, annkgtow)) + geom_point() + ggthemes::theme_tufte() + facet_wrap(~Season)
Key species to match and their most representative trend datasets are determined in analyses above, and the code that looks for the match between this reference set and Atlantis output is developed here. By match we mean significance and direction only.
# assumes biom.txt output already read in
1. Kaplan IC, Marshall KN. A guinea pig’s tale: Learning to review end-to-end marine ecosystem models for management applications. ICES Journal of Marine Science. 2016;73: 1715–1724. doi:10.1093/icesjms/fsw047
2. Pethybridge HR, Weijerman M, Perrymann H, Audzijonyte A, Porobic J, McGregor V, et al. Calibrating process-based marine ecosystem models: An example case using Atlantis. Ecological Modelling. 2019;412: 108822. doi:10.1016/j.ecolmodel.2019.108822